Overview

Dataset statistics

Number of variables: 16
Number of observations: 4240
Missing cells: 645
Missing cells (%): 1.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 530.1 KiB
Average record size in memory: 128.0 B
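
The missing-cell figures can be cross-checked by hand. The sketch below is only an arithmetic check: it assumes the percentage is missing cells divided by all cells (variables times observations), and it sums the per-variable missing counts reported in the variable sections of this report. The variable names in the code are illustrative, not part of the report.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 16
n_observations = 4240
missing_cells = 645

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Per-variable missing counts as listed in the variable sections.
missing_by_variable = {
    "education": 105,
    "cigsPerDay": 29,
    "BPMeds": 53,
    "totChol": 50,
    "BMI": 19,
    "heartRate": 1,
    "glucose": 388,
}
missing_sum = sum(missing_by_variable.values())

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}%")
print(f"sum of per-variable missing counts: {missing_sum}")
```

Under that assumption the per-variable counts add up to the reported total, and the share rounds to the reported 1.0%.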

Variable types

Categorical: 8
Numeric: 8

Alerts

cigsPerDay is highly overall correlated with currentSmoker (High correlation)
sysBP is highly overall correlated with diaBP and 1 other field (High correlation)
diaBP is highly overall correlated with sysBP and 1 other field (High correlation)
glucose is highly overall correlated with diabetes (High correlation)
currentSmoker is highly overall correlated with cigsPerDay (High correlation)
prevalentHyp is highly overall correlated with sysBP and 1 other field (High correlation)
diabetes is highly overall correlated with glucose (High correlation)
BPMeds is highly imbalanced (80.8%) (Imbalance)
prevalentStroke is highly imbalanced (94.8%) (Imbalance)
diabetes is highly imbalanced (82.8%) (Imbalance)
education has 105 (2.5%) missing values (Missing)
BPMeds has 53 (1.2%) missing values (Missing)
totChol has 50 (1.2%) missing values (Missing)
glucose has 388 (9.2%) missing values (Missing)
cigsPerDay has 2145 (50.6%) zeros (Zeros)
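
The report does not say how the imbalance percentages are computed. The three figures above are consistent with one minus the normalized Shannon entropy of the class shares, computed over non-missing values. The sketch below works under that assumption; `imbalance_score` is a hypothetical helper, not a function of the profiling library.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts (assumed definition).

    H is the Shannon entropy (base 2) of the class shares and k is the
    number of classes. Expects at least two classes.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Non-missing class counts as reported in the variable sections.
class_counts = {
    "BPMeds": [4063, 124],
    "prevalentStroke": [4215, 25],
    "diabetes": [4131, 109],
}

scores = {name: 100 * imbalance_score(c) for name, c in class_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1f}%")
```

A perfectly balanced binary variable scores 0% under this definition, and a variable with a single observed class approaches 100%.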

Reproduction

Analysis started: 2023-02-14 04:29:12.842246
Analysis finished: 2023-02-14 04:29:24.107220
Duration: 11.26 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
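
The reported duration follows from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the reproduction block of this report.
fmt = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-02-14 04:29:12.842246", fmt)
finished = datetime.strptime("2023-02-14 04:29:24.107220", fmt)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```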

Variables

male
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.2 KiB
0: 2420
1: 1820

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4240
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  2420  57.1%
1  1820  42.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  2420  57.1%
1  1820  42.9%

Most occurring characters

Value  Count  Frequency (%)
0  2420  57.1%
1  1820  42.9%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  4240  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  2420  57.1%
1  1820  42.9%

Most occurring scripts

Value  Count  Frequency (%)
Common  4240  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  2420  57.1%
1  1820  42.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4240  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  2420  57.1%
1  1820  42.9%

age
Real number (ℝ)

Distinct: 39
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.580189
Minimum: 32
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 32
5-th percentile: 37
Q1: 42
median: 49
Q3: 56
95-th percentile: 64
Maximum: 70
Range: 38
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.5729422
Coefficient of variation (CV): 0.17291064
Kurtosis: -0.98989501
Mean: 49.580189
Median Absolute Deviation (MAD): 7
Skewness: 0.22886703
Sum: 210220
Variance: 73.495338
Monotonicity: Not monotonic
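
Several of the figures above are derived from others. The sketch below recomputes them from the reported values for age, using the usual textbook relations (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = maximum - minimum, IQR = Q3 - Q1). It is a consistency check on the printed numbers, not a description of the profiling library's internals.

```python
# Figures copied from the age summary in this report.
n = 4240             # observations; age has no missing values
total = 210220       # Sum
std = 8.5729422      # Standard deviation
minimum, maximum = 32, 70
q1, q3 = 42, 56

mean = total / n
derived = {
    "mean": mean,
    "variance": std ** 2,
    "cv": std / mean,
    "range": maximum - minimum,
    "iqr": q3 - q1,
}

for name, value in derived.items():
    print(f"{name}: {value:.6f}")
```

The same relations apply to the other numeric variables, with n reduced by the number of missing values where there are any.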
Histogram with fixed size bins (bins=39)

Value  Count  Frequency (%)
40  192  4.5%
46  182  4.3%
42  180  4.2%
41  174  4.1%
48  173  4.1%
39  170  4.0%
44  166  3.9%
45  162  3.8%
43  159  3.8%
52  149  3.5%
Other values (29)  2533  59.7%

Value  Count  Frequency (%)
32  1  < 0.1%
33  5  0.1%
34  18  0.4%
35  42  1.0%
36  84  2.0%
37  92  2.2%
38  144  3.4%
39  170  4.0%
40  192  4.5%
41  174  4.1%

Value  Count  Frequency (%)
70  2  < 0.1%
69  7  0.2%
68  18  0.4%
67  45  1.1%
66  38  0.9%
65  57  1.3%
64  93  2.2%
63  110  2.6%
62  99  2.3%
61  110  2.6%

education
Categorical

Distinct: 4
Distinct (%): 0.1%
Missing: 105
Missing (%): 2.5%
Memory size: 33.2 KiB
1.0: 1720
2.0: 1253
3.0: 689
4.0: 473

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 12405
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4.0
2nd row: 2.0
3rd row: 1.0
4th row: 3.0
5th row: 3.0

Common Values

Value  Count  Frequency (%)
1.0  1720  40.6%
2.0  1253  29.6%
3.0  689  16.2%
4.0  473  11.2%
(Missing)  105  2.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1.0  1720  41.6%
2.0  1253  30.3%
3.0  689  16.7%
4.0  473  11.4%
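
The Common Values table and the Common Values (Plot) table give different percentages for the same education counts. The figures are consistent with the first dividing by all 4240 rows and the second dividing by the 4135 non-missing rows. The sketch below checks that reading; it is an inference from the numbers, not something the report states.

```python
# Counts copied from the education tables in this report.
counts = {"1.0": 1720, "2.0": 1253, "3.0": 689, "4.0": 473}
missing = 105
n_rows = 4240

non_missing = sum(counts.values())

# Share of all rows versus share of rows with a recorded education level.
share_of_all = {v: 100 * c / n_rows for v, c in counts.items()}
share_of_present = {v: 100 * c / non_missing for v, c in counts.items()}

for value in counts:
    print(f"{value}: {share_of_all[value]:.2f}% of all rows, "
          f"{share_of_present[value]:.2f}% of non-missing rows")
```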

Most occurring characters

Value  Count  Frequency (%)
.  4135  33.3%
0  4135  33.3%
1  1720  13.9%
2  1253  10.1%
3  689  5.6%
4  473  3.8%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  8270  66.7%
Other Punctuation  4135  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  4135  50.0%
1  1720  20.8%
2  1253  15.2%
3  689  8.3%
4  473  5.7%

Other Punctuation
Value  Count  Frequency (%)
.  4135  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  12405  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
.  4135  33.3%
0  4135  33.3%
1  1720  13.9%
2  1253  10.1%
3  689  5.6%
4  473  3.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  12405  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
.  4135  33.3%
0  4135  33.3%
1  1720  13.9%
2  1253  10.1%
3  689  5.6%
4  473  3.8%

currentSmoker
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.2 KiB
0: 2145
1: 2095

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4240
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0  2145  50.6%
1  2095  49.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  2145  50.6%
1  2095  49.4%

Most occurring characters

Value  Count  Frequency (%)
0  2145  50.6%
1  2095  49.4%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  4240  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  2145  50.6%
1  2095  49.4%

Most occurring scripts

Value  Count  Frequency (%)
Common  4240  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  2145  50.6%
1  2095  49.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4240  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  2145  50.6%
1  2095  49.4%

cigsPerDay
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 33
Distinct (%): 0.8%
Missing: 29
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.0059368
Minimum: 0
Maximum: 70
Zeros: 2145
Zeros (%): 50.6%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 20
95-th percentile: 30
Maximum: 70
Range: 70
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 11.922462
Coefficient of variation (CV): 1.3238447
Kurtosis: 1.0194182
Mean: 9.0059368
Median Absolute Deviation (MAD): 0
Skewness: 1.2470524
Sum: 37924
Variance: 142.1451
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=33)

Value  Count  Frequency (%)
0  2145  50.6%
20  734  17.3%
30  218  5.1%
15  210  5.0%
10  143  3.4%
9  130  3.1%
5  121  2.9%
3  100  2.4%
40  80  1.9%
1  67  1.6%
Other values (23)  263  6.2%

Value  Count  Frequency (%)
0  2145  50.6%
1  67  1.6%
2  18  0.4%
3  100  2.4%
4  9  0.2%
5  121  2.9%
6  18  0.4%
7  12  0.3%
8  11  0.3%
9  130  3.1%

Value  Count  Frequency (%)
70  1  < 0.1%
60  11  0.3%
50  6  0.1%
45  3  0.1%
43  56  1.3%
40  80  1.9%
38  1  < 0.1%
35  22  0.5%
30  218  5.1%
29  1  < 0.1%

BPMeds
Categorical

IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 53
Missing (%): 1.2%
Memory size: 33.2 KiB
0.0: 4063
1.0: 124

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 12561
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count  Frequency (%)
0.0  4063  95.8%
1.0  124  2.9%
(Missing)  53  1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0  4063  97.0%
1.0  124  3.0%

Most occurring characters

Value  Count  Frequency (%)
0  8250  65.7%
.  4187  33.3%
1  124  1.0%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  8374  66.7%
Other Punctuation  4187  33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  8250  98.5%
1  124  1.5%

Other Punctuation
Value  Count  Frequency (%)
.  4187  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  12561  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  8250  65.7%
.  4187  33.3%
1  124  1.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  12561  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  8250  65.7%
.  4187  33.3%
1  124  1.0%

prevalentStroke
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.2 KiB
0: 4215
1: 25

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4240
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  4215  99.4%
1  25  0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  4215  99.4%
1  25  0.6%

Most occurring characters

Value  Count  Frequency (%)
0  4215  99.4%
1  25  0.6%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  4240  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  4215  99.4%
1  25  0.6%

Most occurring scripts

Value  Count  Frequency (%)
Common  4240  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  4215  99.4%
1  25  0.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4240  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  4215  99.4%
1  25  0.6%

prevalentHyp
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.2 KiB
0: 2923
1: 1317

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4240
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0  2923  68.9%
1  1317  31.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  2923  68.9%
1  1317  31.1%

Most occurring characters

Value  Count  Frequency (%)
0  2923  68.9%
1  1317  31.1%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  4240  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  2923  68.9%
1  1317  31.1%

Most occurring scripts

Value  Count  Frequency (%)
Common  4240  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  2923  68.9%
1  1317  31.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4240  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  2923  68.9%
1  1317  31.1%

diabetes
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.2 KiB
0: 4131
1: 109

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4240
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  4131  97.4%
1  109  2.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  4131  97.4%
1  109  2.6%

Most occurring characters

Value  Count  Frequency (%)
0  4131  97.4%
1  109  2.6%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  4240  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  4131  97.4%
1  109  2.6%

Most occurring scripts

Value  Count  Frequency (%)
Common  4240  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  4131  97.4%
1  109  2.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4240  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  4131  97.4%
1  109  2.6%

totChol
Real number (ℝ)

Distinct: 248
Distinct (%): 5.9%
Missing: 50
Missing (%): 1.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 236.69952
Minimum: 107
Maximum: 696
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 107
5-th percentile: 170
Q1: 206
median: 234
Q3: 263
95-th percentile: 312
Maximum: 696
Range: 589
Interquartile range (IQR): 57

Descriptive statistics

Standard deviation: 44.591284
Coefficient of variation (CV): 0.18838772
Kurtosis: 4.1298894
Mean: 236.69952
Median Absolute Deviation (MAD): 29
Skewness: 0.87188056
Sum: 991771
Variance: 1988.3826
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
240  85  2.0%
220  70  1.7%
260  62  1.5%
210  61  1.4%
232  59  1.4%
250  57  1.3%
200  56  1.3%
225  54  1.3%
230  54  1.3%
205  53  1.2%
Other values (238)  3579  84.4%
(Missing)  50  1.2%

Value  Count  Frequency (%)
107  1  < 0.1%
113  1  < 0.1%
119  1  < 0.1%
124  1  < 0.1%
126  1  < 0.1%
129  1  < 0.1%
133  1  < 0.1%
135  2  < 0.1%
137  1  < 0.1%
140  2  < 0.1%

Value  Count  Frequency (%)
696  1  < 0.1%
600  1  < 0.1%
464  1  < 0.1%
453  1  < 0.1%
439  1  < 0.1%
432  1  < 0.1%
410  3  0.1%
405  1  < 0.1%
398  1  < 0.1%
392  1  < 0.1%

sysBP
Real number (ℝ)

Distinct: 234
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 132.3546
Minimum: 83.5
Maximum: 295
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 83.5
5-th percentile: 104
Q1: 117
median: 128
Q3: 144
95-th percentile: 175
Maximum: 295
Range: 211.5
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 22.0333
Coefficient of variation (CV): 0.16647173
Kurtosis: 2.1566236
Mean: 132.3546
Median Absolute Deviation (MAD): 13
Skewness: 1.145285
Sum: 561183.5
Variance: 485.46629
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
120  107  2.5%
130  102  2.4%
110  96  2.3%
115  89  2.1%
125  88  2.1%
124  84  2.0%
122  80  1.9%
126  73  1.7%
128  73  1.7%
123  72  1.7%
Other values (224)  3376  79.6%

Value  Count  Frequency (%)
83.5  2  < 0.1%
85  1  < 0.1%
85.5  1  < 0.1%
90  2  < 0.1%
92  1  < 0.1%
92.5  2  < 0.1%
93  2  < 0.1%
93.5  2  < 0.1%
94  3  0.1%
95  7  0.2%

Value  Count  Frequency (%)
295  1  < 0.1%
248  1  < 0.1%
244  1  < 0.1%
243  1  < 0.1%
235  1  < 0.1%
232  1  < 0.1%
230  1  < 0.1%
220  2  < 0.1%
217  1  < 0.1%
215  3  0.1%

diaBP
Real number (ℝ)

Distinct: 146
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82.897759
Minimum: 48
Maximum: 142.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 48
5-th percentile: 66
Q1: 75
median: 82
Q3: 90
95-th percentile: 104.525
Maximum: 142.5
Range: 94.5
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 11.910394
Coefficient of variation (CV): 0.14367571
Kurtosis: 1.2753143
Mean: 82.897759
Median Absolute Deviation (MAD): 7.5
Skewness: 0.71325021
Sum: 351486.5
Variance: 141.8575
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
80  262  6.2%
82  152  3.6%
85  137  3.2%
70  135  3.2%
81  131  3.1%
84  122  2.9%
90  119  2.8%
78  116  2.7%
87  113  2.7%
86  108  2.5%
Other values (136)  2845  67.1%

Value  Count  Frequency (%)
48  1  < 0.1%
50  1  < 0.1%
51  1  < 0.1%
52  2  < 0.1%
53  1  < 0.1%
54  1  < 0.1%
55  3  0.1%
56  2  < 0.1%
57  6  0.1%
57.5  3  0.1%

Value  Count  Frequency (%)
142.5  1  < 0.1%
140  1  < 0.1%
136  2  < 0.1%
135  2  < 0.1%
133  2  < 0.1%
132  1  < 0.1%
130  5  0.1%
129  1  < 0.1%
128  1  < 0.1%
127.5  1  < 0.1%

BMI
Real number (ℝ)

Distinct: 1364
Distinct (%): 32.3%
Missing: 19
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.800801
Minimum: 15.54
Maximum: 56.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 15.54
5-th percentile: 20.06
Q1: 23.07
median: 25.4
Q3: 28.04
95-th percentile: 32.78
Maximum: 56.8
Range: 41.26
Interquartile range (IQR): 4.97

Descriptive statistics

Standard deviation: 4.0798402
Coefficient of variation (CV): 0.15812843
Kurtosis: 2.6573096
Mean: 25.800801
Median Absolute Deviation (MAD): 2.49
Skewness: 0.9821833
Sum: 108905.18
Variance: 16.645096
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
22.91  18  0.4%
22.54  18  0.4%
23.48  18  0.4%
22.19  18  0.4%
23.09  16  0.4%
25.09  16  0.4%
23.1  13  0.3%
22.73  13  0.3%
25.23  13  0.3%
27.78  12  0.3%
Other values (1354)  4066  95.9%
(Missing)  19  0.4%

Value  Count  Frequency (%)
15.54  1  < 0.1%
15.96  1  < 0.1%
16.48  1  < 0.1%
16.59  2  < 0.1%
16.61  1  < 0.1%
16.69  1  < 0.1%
16.71  1  < 0.1%
16.73  1  < 0.1%
16.75  1  < 0.1%
16.87  1  < 0.1%

Value  Count  Frequency (%)
56.8  1  < 0.1%
51.28  1  < 0.1%
45.8  1  < 0.1%
45.79  1  < 0.1%
44.71  1  < 0.1%
44.55  1  < 0.1%
44.27  1  < 0.1%
44.09  1  < 0.1%
43.69  1  < 0.1%
43.67  1  < 0.1%

heartRate
Real number (ℝ)

Distinct: 73
Distinct (%): 1.7%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.878981
Minimum: 44
Maximum: 143
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 44
5-th percentile: 60
Q1: 68
median: 75
Q3: 83
95-th percentile: 98
Maximum: 143
Range: 99
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 12.025348
Coefficient of variation (CV): 0.15848062
Kurtosis: 0.90739572
Mean: 75.878981
Median Absolute Deviation (MAD): 7
Skewness: 0.64437182
Sum: 321651
Variance: 144.60899
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
75  563  13.3%
80  385  9.1%
70  305  7.2%
60  231  5.4%
85  228  5.4%
72  222  5.2%
65  197  4.6%
90  172  4.1%
68  151  3.6%
100  98  2.3%
Other values (63)  1687  39.8%

Value  Count  Frequency (%)
44  1  < 0.1%
45  2  < 0.1%
46  1  < 0.1%
47  1  < 0.1%
48  5  0.1%
50  22  0.5%
51  1  < 0.1%
52  17  0.4%
53  11  0.3%
54  12  0.3%

Value  Count  Frequency (%)
143  1  < 0.1%
140  1  < 0.1%
130  1  < 0.1%
125  3  0.1%
122  2  < 0.1%
120  7  0.2%
115  5  0.1%
112  3  0.1%
110  36  0.8%
108  8  0.2%

glucose
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 143
Distinct (%): 3.7%
Missing: 388
Missing (%): 9.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 81.963655
Minimum: 40
Maximum: 394
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 33.2 KiB

Quantile statistics

Minimum: 40
5-th percentile: 62
Q1: 71
median: 78
Q3: 87
95-th percentile: 108.45
Maximum: 394
Range: 354
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 23.954335
Coefficient of variation (CV): 0.29225557
Kurtosis: 58.703741
Mean: 81.963655
Median Absolute Deviation (MAD): 8
Skewness: 6.2149483
Sum: 315724
Variance: 573.81016
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
75  193  4.6%
77  167  3.9%
73  156  3.7%
80  153  3.6%
70  152  3.6%
83  151  3.6%
78  148  3.5%
74  141  3.3%
85  127  3.0%
76  127  3.0%
Other values (133)  2337  55.1%
(Missing)  388  9.2%

Value  Count  Frequency (%)
40  2  < 0.1%
43  1  < 0.1%
44  2  < 0.1%
45  4  0.1%
47  3  0.1%
48  1  < 0.1%
50  3  0.1%
52  2  < 0.1%
53  5  0.1%
54  5  0.1%

Value  Count  Frequency (%)
394  2  < 0.1%
386  1  < 0.1%
370  1  < 0.1%
368  1  < 0.1%
348  1  < 0.1%
332  1  < 0.1%
325  1  < 0.1%
320  1  < 0.1%
297  1  < 0.1%
294  1  < 0.1%

target
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 33.2 KiB
0: 3596
1: 644

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4240
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0  3596  84.8%
1  644  15.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  3596  84.8%
1  644  15.2%

Most occurring characters

Value  Count  Frequency (%)
0  3596  84.8%
1  644  15.2%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  4240  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  3596  84.8%
1  644  15.2%

Most occurring scripts

Value  Count  Frequency (%)
Common  4240  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0  3596  84.8%
1  644  15.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4240  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0  3596  84.8%
1  644  15.2%

Interactions

[Interaction plots: 64 Matplotlib figures, images not included]

Correlations

variable  age  cigsPerDay  totChol  sysBP  diaBP  BMI  heartRate  glucose  male  education  currentSmoker  BPMeds  prevalentStroke  prevalentHyp  diabetes  target
age  1.000  -0.215  0.289  0.391  0.208  0.145  -0.015  0.116  0.007  0.145  0.226  0.132  0.064  0.301  0.098  0.221
cigsPerDay  -0.215  1.000  -0.041  -0.111  -0.089  -0.141  0.079  -0.090  0.325  0.042  0.846  0.023  0.000  0.104  0.000  0.058
totChol  0.289  -0.041  1.000  0.224  0.185  0.148  0.089  0.030  0.083  0.022  0.039  0.069  0.000  0.159  0.090  0.077
sysBP  0.391  -0.111  0.224  1.000  0.778  0.324  0.171  0.117  0.105  0.072  0.123  0.283  0.057  0.709  0.128  0.210
diaBP  0.208  -0.089  0.185  0.778  1.000  0.375  0.179  0.047  0.069  0.046  0.110  0.215  0.040  0.638  0.050  0.156
BMI  0.145  -0.141  0.148  0.324  0.375  1.000  0.056  0.071  0.208  0.084  0.156  0.136  0.209  0.293  0.103  0.087
heartRate  -0.015  0.079  0.089  0.171  0.179  0.056  1.000  0.097  0.115  0.043  0.074  0.069  0.000  0.140  0.046  0.000
glucose  0.116  -0.090  0.030  0.117  0.047  0.071  0.097  1.000  0.000  0.032  0.082  0.085  0.031  0.087  0.718  0.123
male  0.007  0.325  0.083  0.105  0.069  0.208  0.115  0.000  1.000  0.143  0.196  0.049  0.000  0.000  0.000  0.086
education  0.145  0.042  0.022  0.072  0.046  0.084  0.043  0.032  0.143  1.000  0.061  0.000  0.024  0.089  0.042  0.084
currentSmoker  0.226  0.846  0.039  0.123  0.110  0.156  0.074  0.082  0.196  0.061  1.000  0.045  0.026  0.102  0.040  0.011
BPMeds  0.132  0.023  0.069  0.283  0.215  0.136  0.069  0.085  0.049  0.000  0.045  1.000  0.107  0.259  0.045  0.084
prevalentStroke  0.064  0.000  0.000  0.057  0.040  0.209  0.000  0.031  0.000  0.024  0.026  0.107  1.000  0.070  0.000  0.055
prevalentHyp  0.301  0.104  0.159  0.709  0.638  0.293  0.140  0.087  0.000  0.089  0.102  0.259  0.070  1.000  0.075  0.176
diabetes  0.098  0.000  0.090  0.128  0.050  0.103  0.046  0.718  0.000  0.042  0.040  0.045  0.000  0.075  1.000  0.094
target  0.221  0.058  0.077  0.210  0.156  0.087  0.000  0.123  0.086  0.084  0.011  0.084  0.055  0.176  0.094  1.000
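
The high-correlation alerts in the overview can be recovered from this matrix by listing the off-diagonal pairs whose absolute coefficient reaches a cutoff. The sketch below assumes a cutoff of 0.5; the report does not state the threshold it used, so `THRESHOLD` is an assumption chosen because it reproduces the listed alerts. The matrix values are copied from the table above.

```python
names = [
    "age", "cigsPerDay", "totChol", "sysBP", "diaBP", "BMI", "heartRate",
    "glucose", "male", "education", "currentSmoker", "BPMeds",
    "prevalentStroke", "prevalentHyp", "diabetes", "target",
]

# One row per variable, in the same order as `names`.
matrix = [
    [1.000, -0.215, 0.289, 0.391, 0.208, 0.145, -0.015, 0.116, 0.007, 0.145, 0.226, 0.132, 0.064, 0.301, 0.098, 0.221],
    [-0.215, 1.000, -0.041, -0.111, -0.089, -0.141, 0.079, -0.090, 0.325, 0.042, 0.846, 0.023, 0.000, 0.104, 0.000, 0.058],
    [0.289, -0.041, 1.000, 0.224, 0.185, 0.148, 0.089, 0.030, 0.083, 0.022, 0.039, 0.069, 0.000, 0.159, 0.090, 0.077],
    [0.391, -0.111, 0.224, 1.000, 0.778, 0.324, 0.171, 0.117, 0.105, 0.072, 0.123, 0.283, 0.057, 0.709, 0.128, 0.210],
    [0.208, -0.089, 0.185, 0.778, 1.000, 0.375, 0.179, 0.047, 0.069, 0.046, 0.110, 0.215, 0.040, 0.638, 0.050, 0.156],
    [0.145, -0.141, 0.148, 0.324, 0.375, 1.000, 0.056, 0.071, 0.208, 0.084, 0.156, 0.136, 0.209, 0.293, 0.103, 0.087],
    [-0.015, 0.079, 0.089, 0.171, 0.179, 0.056, 1.000, 0.097, 0.115, 0.043, 0.074, 0.069, 0.000, 0.140, 0.046, 0.000],
    [0.116, -0.090, 0.030, 0.117, 0.047, 0.071, 0.097, 1.000, 0.000, 0.032, 0.082, 0.085, 0.031, 0.087, 0.718, 0.123],
    [0.007, 0.325, 0.083, 0.105, 0.069, 0.208, 0.115, 0.000, 1.000, 0.143, 0.196, 0.049, 0.000, 0.000, 0.000, 0.086],
    [0.145, 0.042, 0.022, 0.072, 0.046, 0.084, 0.043, 0.032, 0.143, 1.000, 0.061, 0.000, 0.024, 0.089, 0.042, 0.084],
    [0.226, 0.846, 0.039, 0.123, 0.110, 0.156, 0.074, 0.082, 0.196, 0.061, 1.000, 0.045, 0.026, 0.102, 0.040, 0.011],
    [0.132, 0.023, 0.069, 0.283, 0.215, 0.136, 0.069, 0.085, 0.049, 0.000, 0.045, 1.000, 0.107, 0.259, 0.045, 0.084],
    [0.064, 0.000, 0.000, 0.057, 0.040, 0.209, 0.000, 0.031, 0.000, 0.024, 0.026, 0.107, 1.000, 0.070, 0.000, 0.055],
    [0.301, 0.104, 0.159, 0.709, 0.638, 0.293, 0.140, 0.087, 0.000, 0.089, 0.102, 0.259, 0.070, 1.000, 0.075, 0.176],
    [0.098, 0.000, 0.090, 0.128, 0.050, 0.103, 0.046, 0.718, 0.000, 0.042, 0.040, 0.045, 0.000, 0.075, 1.000, 0.094],
    [0.221, 0.058, 0.077, 0.210, 0.156, 0.087, 0.000, 0.123, 0.086, 0.084, 0.011, 0.084, 0.055, 0.176, 0.094, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated in the report

# Scan the upper triangle so each pair is listed once.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]

for a, b, value in high_pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

With this cutoff the listed pairs match the alerts: cigsPerDay with currentSmoker, sysBP with diaBP and prevalentHyp, diaBP with prevalentHyp, and glucose with diabetes.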

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

index  male  age  education  currentSmoker  cigsPerDay  BPMeds  prevalentStroke  prevalentHyp  diabetes  totChol  sysBP  diaBP  BMI  heartRate  glucose  target
0  1  39  4.0  0  0.0  0.0  0  0  0  195.0  106.0  70.0  26.97  80.0  77.0  0
1  0  46  2.0  0  0.0  0.0  0  0  0  250.0  121.0  81.0  28.73  95.0  76.0  0
2  1  48  1.0  1  20.0  0.0  0  0  0  245.0  127.5  80.0  25.34  75.0  70.0  0
3  0  61  3.0  1  30.0  0.0  0  1  0  225.0  150.0  95.0  28.58  65.0  103.0  1
4  0  46  3.0  1  23.0  0.0  0  0  0  285.0  130.0  84.0  23.10  85.0  85.0  0
5  0  43  2.0  0  0.0  0.0  0  1  0  228.0  180.0  110.0  30.30  77.0  99.0  0
6  0  63  1.0  0  0.0  0.0  0  0  0  205.0  138.0  71.0  33.11  60.0  85.0  1
7  0  45  2.0  1  20.0  0.0  0  0  0  313.0  100.0  71.0  21.68  79.0  78.0  0
8  1  52  1.0  0  0.0  0.0  0  1  0  260.0  141.5  89.0  26.36  76.0  79.0  0
9  1  43  1.0  1  30.0  0.0  0  1  0  225.0  162.0  107.0  23.61  93.0  88.0  0

index  male  age  education  currentSmoker  cigsPerDay  BPMeds  prevalentStroke  prevalentHyp  diabetes  totChol  sysBP  diaBP  BMI  heartRate  glucose  target
4230  0  56  1.0  1  3.0  0.0  0  1  0  268.0  170.0  102.0  22.89  57.0  NaN  0
4231  1  58  3.0  0  0.0  0.0  0  1  0  187.0  141.0  81.0  24.96  80.0  81.0  0
4232  1  68  1.0  0  0.0  0.0  0  1  0  176.0  168.0  97.0  23.14  60.0  79.0  1
4233  1  50  1.0  1  1.0  0.0  0  1  0  313.0  179.0  92.0  25.97  66.0  86.0  1
4234  1  51  3.0  1  43.0  0.0  0  0  0  207.0  126.5  80.0  19.71  65.0  68.0  0
4235  0  48  2.0  1  20.0  NaN  0  0  0  248.0  131.0  72.0  22.00  84.0  86.0  0
4236  0  44  1.0  1  15.0  0.0  0  0  0  210.0  126.5  87.0  19.16  86.0  NaN  0
4237  0  52  2.0  0  0.0  0.0  0  0  0  269.0  133.5  83.0  21.47  80.0  107.0  0
4238  1  40  3.0  0  0.0  0.0  0  1  0  185.0  141.0  98.0  25.60  67.0  72.0  0
4239  0  39  3.0  1  30.0  0.0  0  0  0  196.0  133.0  86.0  20.91  85.0  80.0  0